Evaluating the Accuracy of scite, a Smart Citation Index





machine learning, electronic resources, citation indices


Objectives: Citations do not always equate to endorsement, so it is important to understand the context of a citation. Researchers may rely heavily on a paper they cite, refute it entirely, or mention it only in passing, so accurate classification of citations is valuable to researchers and other users. Although AI tools have emerged that aim to capture this more nuanced meaning, their accuracy has yet to be determined. This project assesses how accurately scite classifies the meaning of citations in a sample of publications.

Methods: Using a previously established sample of systematic reviews that cited retracted publications, we conducted known-item searching in scite, a tool that uses machine learning to categorize the meaning of citations. We recorded both scite’s interpretation of each citation’s meaning and our own assessment of it. Citations were classified as mentioning, supporting, or contrasting. Recall, precision, and F-measure were calculated to describe the accuracy of scite’s assessment in comparison to human assessment.
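The per-class metrics named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `per_class_metrics`, the toy label lists, and the convention of reporting 0.0 when a denominator is zero are all assumptions, and the F-measure is taken to be the balanced F1 score, which the abstract does not specify.

```python
from collections import Counter

# The three citation categories named in the Methods.
LABELS = ("supporting", "mentioning", "contrasting")


def per_class_metrics(human, tool, labels=LABELS):
    """Return {label: (precision, recall, f_measure)}.

    `human` is treated as the reference standard and `tool` as the
    classifier under evaluation. Undefined ratios (zero denominator)
    are reported as 0.0, which is an assumed convention.
    """
    pairs = Counter(zip(human, tool))
    metrics = {}
    for label in labels:
        tp = pairs[(label, label)]  # both human and tool chose this label
        predicted = sum(n for (_, t), n in pairs.items() if t == label)
        actual = sum(n for (h, _), n in pairs.items() if h == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f_measure)
    return metrics


# Toy labels for illustration only; these are NOT the study data.
human = ["supporting", "supporting", "mentioning", "contrasting", "mentioning"]
tool = ["supporting", "mentioning", "mentioning", "mentioning", "mentioning"]

for label, (p, r, f) in per_class_metrics(human, tool).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In the toy data, a tool that defaults to "mentioning" gets perfect recall but only moderate precision for that class, and scores zero on any class it never predicts.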

Results: From the original sample of 324 citations, 98 citations were classified in scite. Of these, scite found that 2 were supporting and 96 were mentioning, while we determined that 42 were supporting, 39 were mentioning, and 17 were contrasting. Supporting citations had high precision and low recall, while mentioning citations had high recall and low precision. F-measures ranged between 0.0 and 0.58, representing low classification accuracy.
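The reported range of F-measures can be checked against the category totals alone. The worked bound below assumes the F-measure is the balanced F1 score; the symbol TP (citations that both scite and the human reviewers placed in the same category) is introduced here for illustration, and the abstract does not report the full cross-tabulation.

```latex
% P = TP / (number labeled by scite), R = TP / (number labeled by reviewers)
F = \frac{2PR}{P + R} = \frac{2\,\mathrm{TP}}{n_{\text{scite}} + n_{\text{human}}}

% Mentioning: n_scite = 96, n_human = 39, and TP cannot exceed 39
F_{\text{mentioning}} = \frac{2\,\mathrm{TP}}{96 + 39} \le \frac{78}{135} \approx 0.58

% Contrasting: scite assigned this label to no citation, so TP = 0
F_{\text{contrasting}} = 0
```

Under these assumptions, the upper end of the reported range corresponds to the case in which every citation the reviewers judged as mentioning was also labeled mentioning by scite, and the lower end to a category scite never assigned.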

Conclusions: In our sample, the overall accuracy of scite’s assessments was low. scite rarely identified supporting or contrasting citations and instead labeled most of them as mentioning. Although there is potential and enthusiasm for AI to make engagement with literature easier and more immediate, the results generated from AI differed significantly from the human interpretation.






How to Cite

Bakker, C., Theis-Mahon, N., & Brown, S. J. (2023). Evaluating the Accuracy of scite, a Smart Citation Index. Hypothesis: Research Journal for Health Information Professionals, 35(2). https://doi.org/10.18060/26528