5 February 2024 | Lin Zhang, Zhe Cao, Yuanyuan Shang, Gunnar Sivertsen, Ying Huang
The article "Missing institutions in OpenAlex: possible reasons, implications, and solutions" by Lin Zhang, Zhe Cao, Yuanyuan Shang, Gunnar Sivertsen, and Ying Huang investigates missing institutional information in the OpenAlex database. OpenAlex, launched in January 2022, is a fully open platform that integrates multiple data sources and offers easy data access and broad coverage. It has been widely used in quantitative science studies, including for the Leiden University ranking. However, the study finds that over 60% of journal articles in OpenAlex lack institutional information, particularly in metadata from earlier years and in the social sciences and humanities.
The authors categorize institutional information into three types: full institutional information (FII), partially missing institutional information (PMII), and completely missing institutional information (CMII). They explore the reasons for the missing data, its potential impact on research results, and possible solutions. The study aims to highlight the importance of improving data quality in open resources to support their responsible use in scientific research.
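The three categories can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the categories are assigned per article according to whether all, some, or none of its authorships list an institution, and it assumes records shaped like OpenAlex work objects, with an `authorships` list whose entries carry an `institutions` list. The function name `classify_institutional_info` and the sample records are hypothetical.

```python
from typing import Any


def classify_institutional_info(work: dict[str, Any]) -> str:
    """Return 'FII', 'PMII', or 'CMII' for one work record.

    Assumed definitions (not taken verbatim from the article):
      FII  - every authorship lists at least one institution
      PMII - some, but not all, authorships list an institution
      CMII - no authorship lists an institution
    """
    authorships = work.get("authorships") or []
    # Count authorships that list at least one institution.
    with_institution = sum(1 for a in authorships if a.get("institutions"))

    if authorships and with_institution == len(authorships):
        return "FII"
    if with_institution > 0:
        return "PMII"
    # Assumption: a record with no authorships at all is treated as CMII.
    return "CMII"


# Hypothetical, hand-written records for illustration only.
full = {"authorships": [{"institutions": [{"display_name": "Univ A"}]},
                        {"institutions": [{"display_name": "Univ B"}]}]}
partial = {"authorships": [{"institutions": [{"display_name": "Univ A"}]},
                           {"institutions": []}]}
missing = {"authorships": [{"institutions": []}, {"institutions": []}]}

labels = [classify_institutional_info(w) for w in (full, partial, missing)]
print(labels)
```

Applied across a sample of records, the share of works labelled PMII or CMII gives a rough estimate of how much institutional information is missing, under the assumptions stated above.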