Citation index



A citation index is a kind of bibliographic index, an index of citations between publications, allowing the user to easily establish which later documents cite which earlier documents. A form of citation index is first found in 12th-century Hebrew religious literature. Legal citation indexes are found in the 18th century and were made popular by citators such as Shepard's Citations (1873). In 1961, Eugene Garfield's Institute for Scientific Information (ISI) introduced the first citation index for papers published in academic journals, first the Science Citation Index (SCI), and later the Social Sciences Citation Index (SSCI) and the Arts and Humanities Citation Index (AHCI). The American Chemical Society converted its printed Chemical Abstracts Service (established in 1907) into the internet-accessible SciFinder in 2008. The first automated citation indexing was done by CiteSeer in 1997 and was patented. Other sources for such data include Google Scholar, Microsoft Academic, Elsevier's Scopus, and the National Institutes of Health's iCite.
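
The core idea described above, looking up which later documents cite a given earlier document, can be sketched as an inverted mapping. The following is a minimal illustration only, not how any particular indexing service is implemented; the function name `build_citation_index` and the paper identifiers are hypothetical.

```python
from collections import defaultdict


def build_citation_index(references):
    """Invert 'paper -> works it cites' into 'cited work -> papers citing it'."""
    cited_by = defaultdict(set)
    for citing, cited_works in references.items():
        for cited in cited_works:
            cited_by[cited].add(citing)
    # Sort the citing papers so that lookups give a stable order.
    return {cited: sorted(citing) for cited, citing in cited_by.items()}


# Hypothetical reference lists: each key is a paper, each value the works it cites.
references = {
    "paper_1997": ["paper_1955", "paper_1963"],
    "paper_2008": ["paper_1963", "paper_1997"],
}

index = build_citation_index(references)
print(index["paper_1963"])  # the later papers that cite paper_1963
```

A reference list answers "what does this paper cite?"; the inverted index answers the opposite question, "who cites this paper?", which is what distinguishes a citation index from an ordinary bibliography.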

History
The earliest known citation index is an index of biblical citations in rabbinic literature, the Mafteah ha-Derashot, attributed to Maimonides and probably dating to the 12th century. It is organized alphabetically by biblical phrase. Later biblical citation indexes are in the order of the canonical text. These citation indices were used both for general and for legal study. The Talmudic citation index En Mishpat (1714) even included a symbol to indicate whether a Talmudic decision had been overridden, just as in the 19th-century Shepard's Citations. Unlike modern scholarly citation indexes, only references to one work, the Bible, were indexed.

In English legal literature, volumes of judicial reports included lists of cases cited in that volume, starting with Raymond's Reports (1743) and followed by Douglas's Reports (1783). Simon Greenleaf (1821) published an alphabetical list of cases with notes on later decisions affecting the precedential authority of the original decision. These early tables of legal citations ("citators") were followed by a more complete, book-length index, Labatt's Table of Cases...California... (1860), and in 1872 by Wait's Table of Cases...New York.... The most important and best-known citation index for legal cases was released in 1873 with the publication of Shepard's Citations.

William Adair, a former president of Shepard's Citations, suggested in 1920 that citation indexes could serve as a tool for tracking science and engineering literature. After learning that Eugene Garfield held a similar opinion, Adair corresponded with Garfield in 1953. The correspondence prompted Garfield to examine the Shepard's Citations index as a model that could be extended to the sciences. Two years later, Garfield published "Citation indexes for science" in the journal Science. In 1959, Garfield started a consulting business, the Institute for Scientific Information (ISI), in Philadelphia and began a correspondence with Joshua Lederberg about the idea. In 1961, Garfield received a grant from the U.S. National Institutes of Health to compile a citation index for genetics. To do so, Garfield's team gathered 1.4 million citations from 613 journals. From this work, Garfield and the ISI produced the first version of the Science Citation Index, published as a book in 1963.

Major citation indexing services
General-purpose, subscription-based academic citation indexes include:
 * Web of Science by Clarivate Analytics (previously the Intellectual Property and Science business of Thomson Reuters)
 * Scopus by Elsevier, available online only, which combines subject searching with citation browsing and tracking in the sciences and social sciences.

Each of these offers an index of citations between publications and a mechanism to establish which documents cite which other documents. They are not open-access and differ widely in cost: Web of Science and Scopus are available by subscription (generally to libraries).

In addition, CiteSeer and Google Scholar are freely available online.

Several open-access, subject-specific citation indexing services also exist, such as:
 * INSPIRE-HEP, which covers high energy physics,
 * PubMed, which covers life sciences and biomedical topics, and
 * Astrophysics Data System, which covers astronomy and physics.

Representativeness of proprietary databases
Clarivate Analytics' Web of Science (WoS) and Elsevier's Scopus databases are synonymous with data on international research, and are considered the two most trusted or authoritative sources of bibliometric data for peer-reviewed global research knowledge across disciplines. Both are also widely used for researcher evaluation and promotion, for assessing institutional impact (for example, the role of WoS in the UK Research Excellence Framework 2021), and for international league tables (bibliographic data from Scopus represents more than 36% of assessment criteria in the THE rankings). But while these databases are generally agreed to contain rigorously assessed, high-quality research, they do not represent the sum of current global research knowledge.

It is often mentioned in popular science articles that the research output of countries in South America, Asia, and Africa is disappointingly low. Sub-Saharan Africa is cited as an example for having "13.5% of the global population but less than 1% of global research output". This claim is based on a 2012 World Bank/Elsevier report that relies on data from Scopus. Similarly, many others have analysed putatively global or international collaborations and mobility using the even more selective WoS database. In both cases, "research output" refers only to papers published in peer-reviewed journals that are indexed in Scopus or WoS.

Both WoS and Scopus are considered highly selective. Both are commercial enterprises, whose standards and assessment criteria are mostly controlled by panels in North America and Western Europe. The same is true for more comprehensive databases such as Ulrich's Web, which lists as many as 70,000 journals, while Scopus has fewer than 50% of these, and WoS has fewer than 25%. While Scopus is larger and geographically broader than WoS, it still covers only a fraction of journal publishing outside North America and Europe. For example, it reports a coverage of over 2,000 journals in Asia ("230% more than the nearest competitor"), which may seem impressive until one considers that in Indonesia alone there are more than 7,000 journals listed on the government's Garuda portal (of which more than 1,300 are currently listed on DOAJ), while at least 2,500 Japanese journals are listed on the J-Stage platform. Similarly, Scopus claims to have about 700 journals listed from Latin America, in comparison with SciELO's 1,285 active journal count; but that is just the tip of the iceberg judging by the 1,300+ DOAJ-listed journals in Brazil alone. Furthermore, the editorial boards of the journals contained in the WoS and Scopus databases are composed largely of researchers from Western Europe and North America. For example, in the journal Human Geography, 41% of editorial board members are from the United States, and 37.8% from the UK. Similarly, a study of ten leading marketing journals in the WoS and Scopus databases concluded that 85.3% of their editorial board members are based in the United States. It comes as no surprise that the research published in these journals is that which fits the editorial boards' world view.
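
The journal counts quoted above can be turned into rough ratios. The sketch below is only back-of-the-envelope arithmetic on the numbers given in the text; the variable and function names are illustrative, and because several of the figures are bounds ("fewer than", "more than", "about"), the results are indicative rather than exact.

```python
# Figures as quoted in the text; names are illustrative assumptions.
ULRICHS_JOURNALS = 70_000

# "Fewer than 50%" and "fewer than 25%" give upper bounds on journal counts.
scopus_upper = 0.50 * ULRICHS_JOURNALS  # fewer than 35,000 journals
wos_upper = 0.25 * ULRICHS_JOURNALS     # fewer than 17,500 journals


def coverage_ratio(indexed, reference_count):
    """Share of a reference journal list that an index covers."""
    return indexed / reference_count


# About 700 Scopus journals from Latin America vs. 1,285 active SciELO journals.
latin_america = coverage_ratio(700, 1285)

# Over 2,000 Scopus journals in Asia vs. the Garuda (Indonesia, 7,000+) and
# J-Stage (Japan, 2,500+) listings alone. Rough: all three figures are bounds.
asia = coverage_ratio(2000, 7000 + 2500)

print(f"Scopus upper bound: {scopus_upper:,.0f} journals")
print(f"WoS upper bound: {wos_upper:,.0f} journals")
print(f"Latin America, Scopus vs SciELO: {latin_america:.1%}")
print(f"Asia, Scopus vs Garuda + J-Stage: {asia:.1%}")
```

Under these assumptions, Scopus lists roughly half as many Latin American journals as SciELO, and about a fifth as many Asian journals as the Indonesian and Japanese platforms list between them.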

Comparison with subject-specific indexes has further revealed the geographical and topic bias. For example, by comparing the coverage of rice research in CAB Abstracts (an agriculture and global health database) with that in WoS and Scopus, Ciarli found that the latter two "may strongly under-represent the scientific production by developing countries, and over-represent that by industrialised countries", and this is likely to apply to other fields of agriculture. This under-representation of applied research in Africa, Asia, and South America may have an additional negative effect on framing research strategies and policy development in these countries. The overpromotion of these databases diminishes the important role of "local" and "regional" journals for researchers who want to publish and read locally relevant content. Some researchers deliberately bypass "high impact" journals when they want to publish locally useful or important research, in favour of outlets that will reach their key audience more quickly, and in other cases in order to publish in their native language.

Furthermore, the odds are stacked against researchers for whom English is a foreign language. 95% of WoS journals are in English, and some scholars consider the use of the English language a hegemonic and unreflective linguistic practice. One consequence is that non-native speakers spend part of their budget on translation and correction and invest a significant amount of time and effort in subsequent corrections, making publishing in English a burden. A far-reaching consequence of the use of English as the lingua franca of science lies in knowledge production, because its use benefits "worldviews, social, cultural, and political interests of the English-speaking center" (p. 123).

The small proportion of research from South East Asia, Africa, and Latin America which makes it into WoS and Scopus journals is not attributable to a lack of effort or quality of research, but to hidden and invisible epistemic and structural barriers (Chan 2019). These are a reflection of "deeper historical and structural power that had positioned former colonial masters as the centers of knowledge production, while relegating former colonies to peripheral roles" (Chan 2018). Many North American and European journals demonstrate conscious and unconscious bias against researchers from other parts of the world. Many of these journals call themselves "international" but represent interests, authors, and even references only in their own languages. As a result, researchers outside Europe and North America are commonly rejected because their research is said to be "not internationally significant" or only of "local interest" (the wrong "local"). This reflects the current concept of "international" as limited to a Euro/Anglophone-centric way of knowledge production. In other words, "the ongoing internationalisation has not meant academic interaction and exchange of knowledge, but the dominance of the leading Anglophone journals in which international debates occurs and gains recognition".

Clarivate Analytics have made some positive steps to broaden the scope of WoS, integrating the SciELO citation index (a move not without criticism) and creating the Emerging Sources Citation Index (ESCI), which has allowed database access to many more international titles. However, there is still a lot of work to be done to recognise and amplify the growing body of research literature generated by those outside North America and Europe. The Royal Society have previously identified that "traditional metrics do not fully capture the dynamics of the emerging global science landscape", and that academia needs to develop more sophisticated data and impact measures to provide a richer understanding of the global scientific knowledge that is available.

Academia has not yet built digital infrastructures which are equal, comprehensive, and multilingual, and which allow fair participation in knowledge creation. One way to bridge this gap is with discipline- and region-specific preprint repositories such as AfricArXiv and InarXiv. Open access advocates recommend remaining critical of those "global" research databases that have been built in Europe or Northern America, and being wary of those who celebrate these products as a representation of the global sum of human scholarly knowledge. They also draw attention to the geopolitical impact that such systematic discrimination has on knowledge production, and on the inclusion and representation of marginalised research demographics within the global research landscape.