The Structure of Collaborative Tagging Systems

Scott A. Golder and Bernardo A. Huberman
Collaborative tagging systems, which allow users to add metadata in the form of keywords to shared content, have become popular on the web. This paper analyzes the structure and dynamics of such systems, revealing regularities in user activity, tag frequencies, and the stability of tag proportions within URLs. A dynamical model is presented that predicts these stable patterns, linking them to imitation and shared knowledge.
Collaborative tagging differs from traditional taxonomies by being non-hierarchical and inclusive, allowing for multiple, overlapping categories. Tags serve various functions: they identify the content of a bookmark, its type, or its owner, refine categories, or express personal opinions. Despite challenges such as polysemy, synonymy, and basic level variation, tags can effectively organize information.
Del.icio.us, a collaborative tagging system, was analyzed to uncover patterns in user behavior and tag usage. Users vary in the frequency and nature of their use: some create many bookmarks and tags, while others use the system less often. Tag lists grow over time as users discover new interests. However, some tags are introduced late, and it is difficult to apply them retroactively to earlier bookmarks. The first tags used in a bookmark tend to be more general and widely agreed upon, reflecting basic levels of categorization.
URLs often experience a burst in popularity, which can be self-sustaining due to their visibility on the "popular" page. However, the initial cause of such bursts is often external, such as mentions on popular websites. Despite this, the proportions of tags used in bookmarks of a URL tend to stabilize over time, even with a relatively small number of bookmarks. This stability can be explained by a stochastic urn model, in which the relative frequencies of tags become stable over time.
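The stabilizing behavior of an urn model can be illustrated with a short simulation. The sketch below assumes a Pólya-style urn, one common stochastic urn model; the summary does not state which variant the paper uses, so the reinforcement rule (one extra copy per draw), the function name simulate_polya_urn, and the tag names are all illustrative assumptions rather than the authors' implementation. Each draw stands for a new bookmark whose tag is chosen with probability proportional to how often that tag has already been used, which is one simple way to model imitation.

```python
import random
from collections import Counter


def simulate_polya_urn(initial_counts, draws, seed=0):
    """Simulate a Polya-style urn (an assumed variant, for illustration).

    initial_counts maps each tag to its starting number of "balls".
    On every draw a tag is picked with probability proportional to its
    current count, and its count is then increased by one.
    Returns a list with the tag proportions after each draw.
    """
    rng = random.Random(seed)       # private generator, reproducible runs
    counts = Counter(initial_counts)
    tags = sorted(counts)           # fixed order so weights line up with tags
    history = []
    for _ in range(draws):
        weights = [counts[t] for t in tags]
        chosen = rng.choices(tags, weights=weights, k=1)[0]
        counts[chosen] += 1         # reinforcement: popular tags grow more likely
        total = sum(counts.values())
        history.append({t: counts[t] / total for t in tags})
    return history


if __name__ == "__main__":
    # Hypothetical tags; print the proportions at a few checkpoints.
    run = simulate_polya_urn({"design": 1, "css": 1, "tools": 1}, draws=2000, seed=42)
    for step in (10, 100, 1000, 2000):
        print(step, {t: round(p, 3) for t, p in run[step - 1].items()})
```

In such a run the proportions typically fluctuate during the first few dozen draws and then change very little, because each additional draw can shift a proportion by at most one over the current total. Different seeds settle on different proportions, which matches the idea that the stable mix depends on early choices rather than on a fixed target.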
The stability of tag proportions suggests that collaborative tagging can lead to a consensus in classification, even with diverse user preferences. This consensus can be useful for organizing and describing web content, even if the tagging is primarily for personal use. The paper concludes that collaborative tagging systems, while often used for personal purposes, can still provide valuable information for others. The stability of tag proportions and the potential for consensus in classification highlight the utility of collaborative tagging in organizing and sharing information on the web.