Verify academic citations against 240M+ papers across 4 databases.
CiteChecker detects potentially hallucinated or fabricated references in academic manuscripts — a growing concern with AI-assisted writing.
- Extracts references from text, `.docx`, `.pdf`, or `.md` files
- Checks each citation against CrossRef, PubMed, Semantic Scholar, and OpenAlex
- Reports which references are verified (with DOI links) and which are not found (potentially fabricated)
```bash
pip install cite-checker
```

```bash
# Check a manuscript file
citecheck check manuscript.docx

# Check from text
citecheck check --text "Smith J, et al. Fake Paper Title. Nature. 2024;123:45-67."

# Set similarity threshold
citecheck check manuscript.pdf --threshold 0.75
```

```python
from citecheck import check_references, extract_references

# Example input: the reference section of a manuscript as a plain string
text = "Smith J, et al. Fake Paper Title. Nature. 2024;123:45-67."

# Extract references from text
refs = extract_references(text)

# Check each reference
results = check_references(refs)
for r in results:
    print(f"{r['status']}: {r['reference']}")
    if r['doi']:
        print(f"  DOI: {r['doi']}")
```

- Extract — Parses reference lists from manuscripts (supports APA, Vancouver, numbered styles)
- Query — Searches each reference across 4 academic databases using fuzzy matching
- Score — Calculates similarity between the cited reference and database results
- Report — Flags references below the similarity threshold as potentially fabricated
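
The score and report steps can be sketched roughly as follows. This is a minimal illustration, not CiteChecker's actual implementation: it assumes titles are compared with Python's standard `difflib.SequenceMatcher`, and the helper names `title_similarity` and `flag_reference` are hypothetical. The default threshold of 0.75 is borrowed from the CLI example.

```python
from difflib import SequenceMatcher


def title_similarity(cited: str, candidate: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and repeated whitespace."""
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    return SequenceMatcher(None, norm(cited), norm(candidate)).ratio()


def flag_reference(cited_title: str, candidates: list[str], threshold: float = 0.75):
    """Hypothetical helper: score the best candidate and label the reference."""
    best = max((title_similarity(cited_title, c) for c in candidates), default=0.0)
    status = "verified" if best >= threshold else "not found"
    return status, best


# A title that differs only in case and spacing matches exactly.
print(flag_reference("Deep Learning", ["deep  learning"]))
# No database hits at all means the reference cannot be verified.
print(flag_reference("Fake Paper Title", []))
```

A reference labelled "not found" here is only a candidate for manual review; a low score can also come from a typo or an unindexed source.
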
| Database | Coverage |
|---|---|
| CrossRef | 150M+ works, DOI resolution |
| PubMed | 36M+ biomedical citations |
| Semantic Scholar | 200M+ papers, all fields |
| OpenAlex | 240M+ works, open metadata |
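
To give a sense of what a database lookup involves, the sketch below builds a search URL for the public CrossRef REST API using its `query.bibliographic` and `rows` parameters. Whether CiteChecker queries CrossRef this way is an assumption; the function name `crossref_query_url` is hypothetical, and the snippet only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def crossref_query_url(reference: str, rows: int = 3) -> str:
    """Build a CrossRef search URL for a free-text reference string."""
    params = {"query.bibliographic": reference, "rows": rows}
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


url = crossref_query_url("Smith J, et al. Fake Paper Title. Nature. 2024;123:45-67.")
print(url)
```

Fetching that URL returns JSON in which candidate records are listed under `message.items`; each candidate would then be compared against the cited reference.
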
- Books and book chapters may not be indexed in journal databases — manual verification recommended
- Very recent papers (< 1 week old) may not yet be indexed
- Non-English titles may have lower match rates due to transliteration
- Similarity threshold can be adjusted to reduce false positives
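
The effect of the threshold can be illustrated with a small sketch. The scores below are made-up values for five hypothetical references, and `flagged` is an illustrative helper rather than part of the CiteChecker API.

```python
# Hypothetical best-match scores for five references (illustrative values only).
scores = {"ref1": 0.98, "ref2": 0.81, "ref3": 0.72, "ref4": 0.40, "ref5": 0.66}


def flagged(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the sorted names of references scoring below the threshold."""
    return sorted(name for name, s in scores.items() if s < threshold)


for t in (0.85, 0.75, 0.65):
    print(t, flagged(scores, t))
# 0.85 ['ref2', 'ref3', 'ref4', 'ref5']
# 0.75 ['ref3', 'ref4', 'ref5']
# 0.65 ['ref4']
```

Lowering the threshold flags fewer references, which reduces false positives but also makes it easier for a fabricated citation that resembles a real paper to pass.
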
A free web interface is available at researchcheck.streamlit.app — part of the Research Integrity Checker suite.
MIT — see LICENSE
Issues and PRs welcome. Please include test cases for new features.
If you use CiteChecker in your research:

```bibtex
@software{citechecker2026,
  title={CiteChecker: Automated Citation Verification},
  author={Tran, Tuyen},
  year={2026},
  url={https://github.com/tuyentran-md/cite_checker}
}
```