The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub

5 Jun 2024 | Cailean Osborne, Jennifer Ding, Hannah Rose Kirk
This paper examines the development activity, social network structure, and model adoption on the Hugging Face (HF) Hub, a popular platform for building, sharing, and demonstrating AI models. The study is divided into three main parts:

1. **Development Activity Analysis**: The analysis reveals that various types of activity, such as likes, downloads, discussions, and commits, exhibit right-skewed distributions across 348,181 model, 65,761 dataset, and 156,642 space repositories. Activity is highly imbalanced, with a small percentage of repositories accounting for a large majority of interactions. The choice of license significantly impacts collaboration patterns, with permissive licenses fostering the highest levels of activity.
2. **Social Network Structure Analysis**: The HF Hub developer community is characterized by a core-periphery structure, with a small group of prolific developers driving most of the collaboration. This network exhibits high reciprocity, indicating mutual relationships among developers. Despite the core-periphery structure, collaboration is not strongly linked to network centrality, suggesting that shared interests or project roles may be more significant factors.
3. **Model Adoption Analysis**: Model adoption in spaces on the HF Hub is also characterized by a right-skewed distribution, with a small number of models being widely used. A minority of developers, primarily from Big Tech companies, are responsible for the development of the most used models. The analysis also highlights the interdependencies and ecosystems surrounding popular models through co-usage patterns.

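To make two of the quantities mentioned above concrete, here is a minimal Python sketch of how activity concentration and network reciprocity could be measured. It is an illustration only, not the authors' method: the function names `top_share` and `reciprocity`, the like counts, and the developer names are all hypothetical, and reciprocity is taken here as the fraction of directed edges whose reverse edge also exists, which may differ from the definition used in the paper.

```python
from statistics import mean, median


def top_share(counts, fraction):
    """Share of total activity held by the top `fraction` of repositories."""
    if not counts or not 0 < fraction <= 1:
        raise ValueError("need non-empty counts and 0 < fraction <= 1")
    ranked = sorted(counts, reverse=True)
    # Always include at least one repository in the "top" group.
    k = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


def reciprocity(edges):
    """Fraction of directed edges whose reverse edge is also present."""
    # Deduplicate edges and ignore self-loops.
    edge_set = {(a, b) for a, b in edges if a != b}
    if not edge_set:
        return 0.0
    mutual = sum(1 for a, b in edge_set if (b, a) in edge_set)
    return mutual / len(edge_set)


# Made-up like counts for ten hypothetical repositories (not from the paper).
likes = [0, 0, 0, 1, 1, 2, 3, 5, 40, 948]

# Made-up directed interactions: (source developer, target developer).
interactions = [("ana", "bo"), ("bo", "ana"), ("ana", "cy"), ("dee", "ana")]

# A mean well above the median is a rough sign of a right-skewed distribution.
print("mean > median:", mean(likes) > median(likes))
print("top 10% share:", top_share(likes, 0.10))
print("reciprocity:", reciprocity(interactions))
```

In this toy data a single repository holds almost all of the likes, and half of the directed edges are reciprocated. Real Hub data would need the same calculations run per activity type (likes, downloads, discussions, commits).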
The study concludes with recommendations for researchers, companies, and policymakers to advance the understanding of open AI development, emphasizing the need for more comprehensive research and evidence-based discussions on open source AI.