This paper explores the problem of automatically extracting keyphrases from text. Keyphrases are phrases of two or more words that capture the main topics of a document. The author treats extraction as a supervised learning task in which each candidate phrase is classified as either a positive example (a keyphrase) or a negative example (not a keyphrase). The paper evaluates two algorithms: C4.5, a general-purpose decision tree induction algorithm, and GenEx, an algorithm designed specifically for keyphrase extraction. The experiments use five document collections and compare the algorithms by counting the matches between machine-generated and human-generated keyphrases. The results show that GenEx, which incorporates specialized procedural domain knowledge, outperforms C4.5 at generating keyphrases. Subjective human evaluation suggests that about 80% of the automatically generated keyphrases are acceptable, a level of performance suitable for a variety of practical applications. The paper also discusses related work, including automatic index generation and information extraction, and provides a detailed description of GenEx, which combines the Genitor genetic algorithm with the Extractor keyphrase extraction algorithm.
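
The match-counting evaluation described above can be illustrated with a minimal sketch. This is not the paper's evaluation code: the function names (`normalize`, `count_matches`, `precision`) are hypothetical, and matching here uses plain lowercasing and whitespace normalization as a simplifying assumption, so the paper's own matching procedure may well differ.

```python
from typing import Iterable


def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace (an assumed, simplified normalization)."""
    return " ".join(phrase.lower().split())


def count_matches(machine: Iterable[str], human: Iterable[str]) -> int:
    """Count distinct machine-generated phrases that also appear in the human list."""
    machine_set = {normalize(p) for p in machine}
    human_set = {normalize(p) for p in human}
    return len(machine_set & human_set)


def precision(machine: Iterable[str], human: Iterable[str]) -> float:
    """Fraction of distinct machine-generated phrases that match a human phrase."""
    machine_list = list(machine)
    distinct = {normalize(p) for p in machine_list}
    if not distinct:
        return 0.0
    return count_matches(machine_list, human) / len(distinct)


# Hypothetical example data, invented purely for illustration.
machine_phrases = ["Genetic Algorithm", "decision tree", "keyphrase extraction", "document"]
human_phrases = ["keyphrase extraction", "genetic algorithm", "machine learning"]

matches = count_matches(machine_phrases, human_phrases)
score = precision(machine_phrases, human_phrases)
print(matches, score)
```

With the invented lists above, two of the four machine-generated phrases match a human-assigned phrase, giving a precision of 0.5. Comparing two algorithms under this scheme amounts to running both on the same documents and comparing their match counts.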