2004 | Li Ding, Tim Finin, Anupam Joshi, Yun Peng, R. Scott Cost, Joel Sachs, Rong Pan, Pavan Reddivari, Vishal Doshi
Swoogle is a crawler-based indexing and retrieval system for the Semantic Web, designed to handle Web documents written in RDF or OWL. It extracts metadata for each discovered document and computes relations between documents. Documents are indexed using either character N-grams or URIrefs as keywords, both to find relevant documents and to compute the similarity between them. A key feature is the computation of rank, a measure of a Semantic Web document's importance.
Swoogle is designed to automatically discover Semantic Web documents (SWDs), index their metadata, and answer queries about them. It distinguishes itself from other Semantic Web repositories and query systems by discovering SWDs itself rather than relying on annotations or user-submitted URLs. The system includes components for SWD discovery, metadata creation, data analysis, and interface. It also includes an algorithm called Ontology Rank, inspired by the PageRank algorithm, which ranks the hits returned by the retrieval engine.
Swoogle's architecture consists of a database storing metadata about SWDs, two distinct web crawlers for discovering SWDs, components for computing document metadata and semantic relationships, an N-gram-based indexing and retrieval engine, a user interface for querying the system, and APIs for providing services. The system supports rich query constraints on semantic relations and lets users search for SWDs by various criteria, including type, ontology ratio, and rank.
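To make the idea of character N-gram indexing concrete, the following is a minimal sketch and not Swoogle's actual engine. It assumes trigrams (n = 3), an in-memory inverted index, and a simple overlap count as the relevance score; the function names and the sample documents are hypothetical.

```python
from collections import defaultdict


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased string."""
    s = text.lower()
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def build_index(docs, n=3):
    """Build an inverted index: n-gram -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in char_ngrams(text, n):
            index[gram].add(doc_id)
    return index


def search(index, query, n=3):
    """Order documents by how many query n-grams they share (ties broken by id)."""
    scores = defaultdict(int)
    for gram in char_ngrams(query, n):
        for doc_id in index.get(gram, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical sample data standing in for text extracted from SWDs.
docs = {
    "doc1": "foaf Person knows",
    "doc2": "time ontology Instant Interval",
    "doc3": "foaf Organization",
}
index = build_index(docs)
results = search(index, "person")
```

Because matching is done on character fragments rather than whole words, a query can still hit documents whose terms are only partly similar; indexing on URIrefs instead would trade that tolerance for exact matches on identifiers.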
Swoogle's metadata includes basic metadata such as language features, RDF statistics, and ontology annotations, as well as relations among SWDs. The system also computes ranks of SWDs using a "rational random surfing model" that accounts for the various types of links between SWDs. The current version of Swoogle has discovered and analyzed over 11,000 semantic web documents and is designed to support millions of documents in future versions. The system is an ongoing project undergoing constant development and is available at http://swoogle.umbc.edu.
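One way to picture a ranking that accounts for link types is a PageRank-style iteration in which a surfer follows each outgoing link with probability proportional to a weight assigned to its type. The sketch below is only an illustration under stated assumptions: the link-type labels, their weights, the damping factor, and the handling of documents without outgoing links are all hypothetical and are not taken from the Swoogle system.

```python
def weighted_rank(links, weights, damping=0.85, iterations=50):
    """PageRank-style rank over typed links.

    links:   list of (source, target, link_type) tuples
    weights: dict mapping link_type -> positive weight (assumed values)
    """
    nodes = sorted({n for s, t, _ in links for n in (s, t)})
    count = len(nodes)

    # Total outgoing weight per document, used to normalise link choices.
    out_weight = {n: 0.0 for n in nodes}
    for s, _, kind in links:
        out_weight[s] += weights[kind]

    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by documents with no outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for s, t, kind in links:
            if out_weight[s] > 0.0:
                new_rank[t] += damping * rank[s] * weights[kind] / out_weight[s]
        rank = new_rank
    return rank


# Hypothetical link types and weights, chosen only for illustration.
weights = {"imports": 3.0, "uses_term": 1.0}
links = [
    ("a", "onto", "imports"),
    ("a", "b", "uses_term"),
    ("b", "onto", "uses_term"),
]
rank = weighted_rank(links, weights)
```

In this toy graph the document `onto` ends up with the highest rank, since it receives the heavily weighted link from `a` as well as the link from `b`. The ranks sum to one, so they can be read as the long-run share of the surfer's visits.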