26 Feb 2024 | Jack H. Culbert, Anne Hobert, Najko Jahn, Nick Haupka, Marion Schmidt, Paul Donner, Philipp Mayr
This study compares the reference and metadata coverage of OpenAlex with that of Web of Science (WoS) and Scopus. OpenAlex is an open-source scholarly metadata source that offers free access to bibliometric data, making it a potential alternative to proprietary databases. However, its data is evolving rapidly, and its reliability is still under scrutiny. The study uses a shared corpus of 16,788,282 publications indexed in all three databases to assess their reference and metadata coverage. The results show that OpenAlex has average source reference numbers and internal coverage comparable to those of WoS and Scopus. Compared with the other two databases, OpenAlex captures more ORCID identifiers, fewer abstracts, and a similar amount of open access information per article.
Beyond references, the study examines other metadata: abstracts, ORCIDs, and open access status. OpenAlex shows higher ORCID coverage than WoS and Scopus, but its abstract coverage is lower. The open access status information is sourced from Unpaywall, which is used by all three databases. The study finds that OpenAlex has a more linear distribution of open access information than WoS and Scopus, suggesting a possible indexing lag in the latter two databases.
The study also identifies discrepancies between reported and pre-calculated reference counts in both OpenAlex and WoS. These discrepancies may be due to inconsistent data ingestion or the inclusion of deleted items in OpenAlex. The study concludes that while OpenAlex has reference coverage comparable to WoS and Scopus, it does not have the highest internal coverage. However, its large size suggests that it may have a higher proportion of referenced publications within its database. The study highlights the importance of accurate reference matching and the need for further research into the differences in coverage and matching algorithms between the databases. It also notes that OpenAlex has issues with author disambiguation, particularly for Chinese authors, which may affect the accuracy of ORCID assignments. Overall, the study suggests that OpenAlex is a viable alternative to proprietary databases for bibliometric research, but further improvements are needed to ensure its reliability and accuracy.
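To make the two reference-related measures discussed above concrete, here is a minimal sketch in Python. It assumes that internal coverage means the share of a publication's references that resolve to records indexed in the same database, and that a count discrepancy means the pre-calculated reference count stored on a record differs from the number of entries in its reference list. The `Record` class, its field names, and both helper functions are hypothetical and do not mirror the schema of OpenAlex, WoS, or Scopus, nor the method used in the study.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical publication record; the fields are illustrative only."""
    record_id: str
    reported_ref_count: int            # pre-calculated count stored on the record
    references: List[Optional[str]]    # matched record ID, or None if unmatched


def internal_coverage(record: Record) -> Optional[float]:
    """Share of listed references that are matched to a record in the same database."""
    total = len(record.references)
    if total == 0:
        return None  # undefined for records without a reference list
    matched = sum(1 for ref in record.references if ref is not None)
    return matched / total


def has_count_discrepancy(record: Record) -> bool:
    """True when the stored count disagrees with the actual reference list length."""
    return record.reported_ref_count != len(record.references)


if __name__ == "__main__":
    # Toy data, not taken from any of the three databases.
    corpus = [
        Record("A", 3, ["B", None, "C"]),
        Record("B", 5, ["A"]),
    ]
    for rec in corpus:
        print(rec.record_id, internal_coverage(rec), has_count_discrepancy(rec))
```

Averaging `internal_coverage` over a shared corpus would give one database-level figure per source; whether unmatched references are even present in a given database's export is itself one of the differences a comparison has to account for.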