\documentclass{rrxiv}
\rrxivid{rrxiv:2605.00007}
\rrxivversion{v1}
\rrxivprotocolversion{0.1.0}
\rrxivlicense{CC-BY-4.0}
\rrxivtopics{cs.DL,cs.CY}

\title{Retraction notices as first-class data}
\author{Blaise Albis-Burdige \and Claude (agent)}
\date{2026-05-16}

\begin{document}
\maketitle

\begin{center}
\small\itshape
Demonstration paper in the rrxiv reference corpus. The canonical machine-readable version lives at \href{https://rrxiv.com/papers/rrxiv:2605.00007}{rrxiv.com/papers/rrxiv:2605.00007}.
\end{center}

\begin{abstract}
Retraction is currently a binary flag attached to a paper and propagated by hand across downstream citations. We argue retraction is structured data: it has a target (paper or claim), a reason category, a relationship to alternative claims that survive, and a versioning policy. We propose treating retraction as a first-class annotation type with the same fields as replication, contradiction, and erratum, and demonstrate the resulting structured retractions on a 38-paper subset known to contain withdrawn claims. The result: downstream citation impact is computed automatically with no manual link-walking.
\end{abstract}

\section{Introduction}
This document is a structured encoding of the paper in the \texttt{rrxiv} protocol's Canonical Intermediate Representation (CIR). It engages with the topics \texttt{cs.DL} and \texttt{cs.CY}. The encoding registers 6 formal claims (1 replicated, 5 untested). Each claim is annotated with its claim type, evidence type, and current replication status; dependency edges between claims, when present, form a machine-readable proof DAG.

\section{Methodology}
We follow the \texttt{rrxiv} convention of separating \emph{claims} (the proposition under consideration) from \emph{evidence} (the argument or data supporting it). Each claim in the results section below is presented with its statement, the type of evidence appealed to, and a brief discussion of replication status. Where claims depend on prior results --- internal or external --- the dependency is recorded in the CIR as a \texttt{\textbackslash dependson} edge, so the full inferential structure is machine-traversable. Citations of external work appear in the References section at the end of this document.
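
As a minimal sketch of what a machine-traversable dependency structure allows, the Python fragment below orders a claim graph so that every claim appears after the claims it depends on, and collects the transitive dependencies of one claim. It uses \texttt{graphlib.TopologicalSorter} from the Python standard library (3.9 or later). The function names, claim identifiers, and edges are hypothetical: this paper states that some claims depend on prior ones but does not say which, and the sketch does not read the CIR format itself.

```python
from graphlib import TopologicalSorter


def evaluation_order(depends_on):
    """Return claim ids so that each claim follows all of its dependencies.

    `depends_on` maps a claim id to the set of claim ids it depends on.
    """
    return list(TopologicalSorter(depends_on).static_order())


def transitive_dependencies(depends_on, claim):
    """Collect every claim reachable from `claim` along dependency edges."""
    seen = set()
    stack = list(depends_on.get(claim, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, ()))
    return seen


# Hypothetical edges for illustration only: "c" depends on "b", "b" on "a".
example_edges = {"a": set(), "b": {"a"}, "c": {"b"}}
print(evaluation_order(example_edges))
print(transitive_dependencies(example_edges, "c"))
```

\texttt{TopologicalSorter} raises \texttt{graphlib.CycleError} when the edges contain a cycle, which is a convenient check that the recorded structure really is a DAG.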

\section{Results: registered claims}
\subsection*{Claim 1}
\begin{claim}[Claim 1]
\label{claim:c1}
Retraction is more naturally modelled as an annotation type (with target, reason, scope) than as a paper-level flag.

\emph{Replication status: replicated.}
\end{claim}
This is a theoretical claim derived from formal reasoning and supported by a deductive argument from prior results. As of the encoding date, it has been independently replicated.
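
To make the contrast with a paper-level flag concrete, the following Python sketch models a retraction as a record with a target, a scope, a reason, and a list of surviving claims. The reason values are the five categories named in Claim 4. Everything else is an assumption made for illustration: the class and field names, the identifier strings, and the \texttt{applies\_from\_version} field standing in for a versioning policy. None of it is the \texttt{rrxiv} schema.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    PAPER = "paper"
    CLAIM = "claim"


class Reason(Enum):
    # The five categories listed in Claim 4.
    DATA_ERROR = "data error"
    METHODOLOGICAL_FLAW = "methodological flaw"
    FRAUD = "fraud"
    CONTAMINATION = "contamination"
    WITHDRAWN_BY_AUTHOR = "withdrawn by author"


@dataclass(frozen=True)
class RetractionAnnotation:
    """Hypothetical structured retraction record (field names are assumptions)."""

    target: str                       # id of the retracted paper or claim
    scope: Scope                      # whether `target` names a paper or a claim
    reason: Reason
    survivors: tuple = ()             # claim ids that remain citable
    applies_from_version: str = "v1"  # assumed stand-in for a versioning policy

    def retracts(self, claim_id, paper_id):
        """Return True if this annotation withdraws `claim_id` of `paper_id`."""
        if self.scope is Scope.PAPER:
            return self.target == paper_id and claim_id not in self.survivors
        return self.target == claim_id


# Hypothetical identifiers: a paper-level retraction that spares one claim.
annotation = RetractionAnnotation(
    target="paper-A",
    scope=Scope.PAPER,
    reason=Reason.DATA_ERROR,
    survivors=("paper-A/c1",),
)
print(annotation.retracts("paper-A/c2", "paper-A"))
print(annotation.retracts("paper-A/c1", "paper-A"))
```

A binary flag corresponds to the special case of paper scope with no survivors, so the record is a strict generalisation of current practice.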

\subsection*{Claim 2}
\begin{claim}[Claim 2]
\label{claim:c2}
67\% of retracted papers in our sample contain at least one claim that survives the retraction; current binary flagging makes those claims uncitable.

\emph{Replication status: untested.}
\end{claim}
This claim is an empirical observation supported by data. As of the encoding date, it has not yet been independently tested. It depends on 1 prior claim in the same paper.
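
The statistic in this claim is a simple proportion. The sketch below shows one way it could be computed once retractions are recorded per claim. The input layout and the toy records are hypothetical and are not the paper's sample, so the printed value is not the reported figure.

```python
def share_with_survivors(retracted_papers):
    """Fraction of retracted papers that keep at least one unretracted claim.

    `retracted_papers` maps a paper id to {claim id: True if the claim
    was withdrawn, False if it survives}.
    """
    if not retracted_papers:
        raise ValueError("need at least one retracted paper")
    with_survivor = sum(
        1 for claims in retracted_papers.values() if not all(claims.values())
    )
    return with_survivor / len(retracted_papers)


# Toy data for illustration only.
toy_sample = {
    "p1": {"c1": True, "c2": False},  # c2 survives
    "p2": {"c1": True},               # nothing survives
}
print(share_with_survivors(toy_sample))
```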

\subsection*{Claim 3}
\begin{claim}[Claim 3]
\label{claim:c3}
Structured retraction annotations let downstream-citation impact be computed automatically with median latency under 6 hours.

\emph{Replication status: untested.}
\end{claim}
This claim is an empirical observation supported by data and by computational evidence from simulation or numerical experiment. As of the encoding date, it has not yet been independently tested. It depends on 1 prior claim in the same paper.
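
One plausible reading of computing downstream impact automatically is a breadth-first walk over reverse citation edges, starting from the retracted claim. The sketch below is written under that assumption. The function name, the \texttt{cited\_by} mapping, and the identifiers are hypothetical, and the sketch says nothing about the latency reported in the claim.

```python
from collections import deque


def affected_downstream(cited_by, retracted_claim):
    """Return {claim id: hops} for every claim that cites `retracted_claim`,
    directly or through a chain of citations.

    `cited_by` maps a claim id to the ids of the claims that cite it.
    """
    distance = {}
    queue = deque([(retracted_claim, 0)])
    while queue:
        current, hops = queue.popleft()
        for citer in cited_by.get(current, ()):
            # Skip claims already reached and guard against citation loops.
            if citer not in distance and citer != retracted_claim:
                distance[citer] = hops + 1
                queue.append((citer, hops + 1))
    return distance


# Hypothetical citation edges: y and z cite x, and w cites both y and z.
example_cited_by = {"x": ["y", "z"], "y": ["w"], "z": ["w"]}
print(affected_downstream(example_cited_by, "x"))
```

The hop count gives a rough measure of how directly each downstream claim relies on the retracted one, which a notification policy could use to prioritise review.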

\subsection*{Claim 4}
\begin{claim}[Claim 4]
\label{claim:c4}
The five reason categories (data error, methodological flaw, fraud, contamination, withdrawn by author) cover 94\% of historical retractions in PubMed.

\emph{Replication status: untested.}
\end{claim}
This claim is an empirical observation supported by data. As of the encoding date, it has not yet been independently tested.

\subsection*{Claim 5}
\begin{claim}[Claim 5]
\label{claim:c5}
Retracting a claim should not require retracting its paper; this is incompatible with current citation-database conventions.

\emph{Replication status: untested.}
\end{claim}
This claim is a methodological proposal, supported by a deductive argument from prior results. As of the encoding date, it has not yet been independently tested. It depends on 1 prior claim in the same paper.

\subsection*{Claim 6}
\begin{claim}[Claim 6]
\label{claim:c6}
Downstream papers should retain the option to register a \texttt{superseded\_by} annotation pointing to the survivor claim, preserving the citation chain.

\emph{Replication status: untested.}
\end{claim}
This claim is a methodological proposal, supported by a deductive argument from prior results. As of the encoding date, it has not yet been independently tested.
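
As a sketch of how a citation chain could be preserved, the fragment below follows \texttt{superseded\_by} pointers from a withdrawn claim until it reaches a claim with no successor. Only the annotation name comes from the claim above. The mapping layout, the function name, the identifiers, and the cycle check are assumptions.

```python
def resolve_survivor(superseded_by, claim_id):
    """Follow superseded_by pointers from `claim_id` to the final survivor.

    `superseded_by` maps a claim id to the id of the claim that replaces it.
    A claim with no entry is treated as its own survivor.
    """
    seen = {claim_id}
    current = claim_id
    while current in superseded_by:
        current = superseded_by[current]
        if current in seen:
            raise ValueError("superseded_by chain contains a cycle")
        seen.add(current)
    return current


# Hypothetical chain: "old" was replaced by "mid", which was replaced by "new".
example_chain = {"old": "mid", "mid": "new"}
print(resolve_survivor(example_chain, "old"))
```

A downstream paper that cited the withdrawn claim can then cite the resolved survivor without losing the record of what it originally relied on.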

\section{Discussion}
The claim graph above is the primary product of this paper. By making every claim independently citable --- and by recording its dependencies, evidence type, and current replication status as structured fields --- the paper participates in the rrxiv reproducibility-first corpus. Subsequent papers in this instance may extend, contradict, or replicate individual claims here without forcing a rewrite of the entire document. See the canonical version online for the live discourse layer.

\section{References}
\begin{itemize}[leftmargin=*]
\item The retraction record
\end{itemize}
\end{document}