The Non-trivial Effects of Trivial Errors in Scientific Communication and Evaluation

by Terje Tüür-Fröhlich

Aug. 2016, hardcover, thread-stitched, 164 pp.

ISBN 978-3-86488-104-6, 24,80 € (D)

[Also: doctoral dissertation, Johannes Kepler Universität Linz, 2014]
[Text in English]


Thomson Reuters’ citation indexes (SCI, SSCI and AHCI) are said to be “authoritative”. Because these databases have a huge influence on the global academic evaluation of productivity and impact, Terje Tüür-Fröhlich decided to conduct case studies on the data quality of Social Sciences Citation Index (SSCI) records.

Tüür-Fröhlich investigated articles from social science and law. The main findings: SSCI records contain tremendous amounts of “trivial errors”, and not only the misspellings and typos previously mentioned in the bibliometrics and scientometrics literature. Tüür-Fröhlich's research also documented fatal errors which have not yet been mentioned in the scientometrics literature at all. Tüür-Fröhlich found more than 80 fatal mutations and mutilations of the name Pierre Bourdieu (e.g. “Atkinson” or “Pierre, B.” and “Pierri, B.”). SSCI even generated zombie references (phantom authors and works) through confusion of data fields, a deadly sin for a database producer, as fragments of Patent Laws were indexed as fictional author surnames/initials. Additionally, horrific OCR errors (e.g. “nuxure” instead of “Nature” as journal title) were identified.

Tüür-Fröhlich's extensive quantitative case study of an article from the Harvard Law Review resulted in a devastating finding: only 1% of all correct references from the original article were indexed by SSCI without any mistake or error. Many scientific communication experts and database providers believe that errors in databases are of minor importance: there are many errors, yes, but they counterbalance each other; errors would not result in citation losses and would not have any effect on retrieval and evaluation outcomes. Terje Tüür-Fröhlich claims the contrary: errors and inconsistencies are not evenly distributed but linked with language biases and publication cultures.


From the Foreword:

In her doctoral thesis, Terje Tüür-Fröhlich has developed her own research methods or modified existing ones (she calls them “ping-pong” and “snowball”). Her results show impressively that in the age of Big Data and the ever-increasing belief in inductivism and in the automation of research, qualitative, intellectual, detailed analysis remains indispensable. The severe endogenous SSCI database errors discovered by the author, e.g. numerous phantom authors or phantom works compiled from fragments of footnotes, could hardly have been detected by the usual automated procedures. Criticism is an essential element of scientific rationality. It cannot be replaced by automated big data analysis. Terje Tüür-Fröhlich’s doctoral dissertation is a pioneering work.

o. Univ.-Prof. Dr. Volker Gadenne
Department of Philosophy and Theory of Science
Johannes Kepler University (JKU) Linz / Austria


[= Schriften zur Informationswissenschaft; vol. 69]

