2000 | ANDREW KACHITES McCALLUM, KAMAL NIGAM, JASON RENNIE, KRISTIE SEYMORE
The paper "Automating the Construction of Internet Portals with Machine Learning" by Andrew Kachites McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore discusses the growing popularity of domain-specific internet portals, which gather and organize web content for easy access and search. These portals, such as www.campussearch.com, offer advanced functionality that general search engines lack. However, maintaining them is often labor-intensive and time-consuming. The authors propose using machine learning techniques to automate the creation and maintenance of such portals. They describe new research in reinforcement learning, information extraction, and text classification, which enable efficient spidering, identification of informative text segments, and the construction of topic hierarchies, respectively. Using these techniques, they have developed a demonstration system, a portal for computer science research papers, which contains over 50,000 papers and is publicly available. The paper highlights the efficiency and broad applicability of these techniques in creating portals for various domains.