July 9, 1997 | Robert Cooley, Bamshad Mobasher, Jaideep Srivastava
The paper "Web Mining: Information and Pattern Discovery on the World Wide Web" by Robert Cooley, Bamshad Mobasher, and Jaideep Srivastava provides an overview of the field of web mining, which combines data mining and the World Wide Web. The authors define web mining and distinguish between two main dimensions: web content mining and web usage mining. Web content mining involves discovering and analyzing information from millions of sources on the web, while web usage mining focuses on mining web access logs and user browsing patterns. The paper discusses the challenges and techniques in web usage mining, including data preprocessing, transaction identification, and pattern discovery techniques such as association rules and sequential patterns. It also presents a general architecture for web usage mining and introduces the WEBMINER system, which implements parts of this architecture.
The authors conclude by identifying future research directions, emphasizing the need for improved data preprocessing, mining algorithms, and analysis tools to better understand user behavior on the web.