The Hearsay-II Speech-Understanding System: Integrating Knowledge to Resolve Uncertainty

June 1980 | LEE D. ERMAN, FREDERICK HAYES-ROTH, VICTOR R. LESSER, D. RAJ REDDY
Hearsay-II is a speech-understanding system that integrates diverse knowledge to resolve uncertainty. Developed during a five-year DARPA-sponsored research program, it represents both a specific solution to the speech-understanding problem and a general framework for coordinating independent processes to achieve cooperative problem-solving behavior. Speech understanding requires interpreting spoken sounds, which are the end product of a long chain of transformations from a speaker's intentions to acoustic waves, and each step in that chain introduces ambiguity and uncertainty. Hearsay-II reconstructs the intention from hypothetical interpretations at various levels of abstraction, allocating its limited processing resources to the most promising incremental actions. It includes components that generate and evaluate speech hypotheses and a focus-of-control mechanism that identifies the most valuable actions. Many of these procedures are novel approaches to speech problems, and the system integrates and coordinates them to resolve uncertainty and control combinatorics. The system has been adapted to other problem domains, and this trend is expected to continue.
The paper discusses the characteristics of the speech problem, the special kinds of problem-solving uncertainty in that domain, the structure of the Hearsay-II system, and its relationship to other speech-understanding systems. It is intended for a general computer science audience and assumes no prior knowledge of speech or artificial intelligence.
Hearsay-II uses a blackboard architecture to coordinate knowledge sources (KSs), which are independent programs that generate, combine, and evaluate hypotheses. The blackboard records the hypotheses that the KSs generate; the KSs communicate through it and can modify hypotheses already posted there.
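The blackboard and KS roles described above can be illustrated with a small sketch. This is not the Hearsay-II implementation: the class names, the fields, the level names, the example words, and the rule of taking the minimum credibility of the parts are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    level: str          # abstraction level; the names used here are illustrative
    label: str          # what is being hypothesized
    start: int          # start time in arbitrary units
    end: int            # end time in arbitrary units
    credibility: float  # assumed to lie in [0, 1] for this sketch


@dataclass
class Blackboard:
    hypotheses: list = field(default_factory=list)

    def post(self, hyp):
        """Record a hypothesis so that other KSs can see it."""
        self.hypotheses.append(hyp)
        return hyp

    def at_level(self, level):
        """Return a new list of the hypotheses at one abstraction level."""
        return [h for h in self.hypotheses if h.level == level]


def word_sequence_ks(board):
    """Toy KS: join time-adjacent word hypotheses into two-word sequences."""
    created = []
    words = board.at_level("word")
    for left in words:
        for right in words:
            if left.end == right.start:
                created.append(board.post(Hypothesis(
                    level="word-sequence",
                    label=f"{left.label} {right.label}",
                    start=left.start,
                    end=right.end,
                    # Assumed combination rule: a sequence is only as
                    # credible as its weakest word.
                    credibility=min(left.credibility, right.credibility),
                )))
    return created


board = Blackboard()
board.post(Hypothesis("word", "SHOW", 0, 10, 0.8))
board.post(Hypothesis("word", "ME", 10, 25, 0.6))
board.post(Hypothesis("word", "PAPERS", 30, 40, 0.7))
new_hyps = word_sequence_ks(board)
print([(h.label, h.credibility) for h in new_hyps])
```

The KS never calls another KS directly. It reads from the blackboard and posts back to it, which is what lets independent programs cooperate without knowing about each other.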
The system represents the utterance at multiple levels of abstraction, and hypotheses at each level capture different aspects of it. The system handles uncertainty by generating and evaluating many partial interpretations, which are then combined into a complete interpretation. The credibility of each hypothesis measures its consistency with the data and with expectations.
A heuristic scheduler calculates a priority for each pending activity, based on its potential to contribute to the overall goal of recognizing the utterance, and executes the highest-priority action. Selective attention and opportunistic problem solving are how the system copes with uncertainty and combinatorial explosion, and the structure allows KS actions to be scheduled flexibly as conditions on the blackboard change. The paper demonstrates Hearsay-II on an example utterance, for which the system hypothesizes several words and generates word-sequence hypotheses. Its ability to handle uncertainty and integrate diverse knowledge sources makes it a powerful framework for solving complex problems cooperatively.
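The scheduling idea described above, computing a priority for each pending action and always executing the highest-priority one, can be sketched with a priority queue. This is a minimal sketch under stated assumptions, not the Hearsay-II scheduler: the `Scheduler` class, the action names, the numeric priorities, and the `max_steps` budget are all hypothetical.

```python
import heapq
import itertools


class Scheduler:
    """Agenda of pending KS actions, executed highest priority first."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: earlier entries first

    def add(self, priority, name, action):
        # heapq is a min-heap, so the priority is negated on insertion.
        heapq.heappush(self._heap, (-priority, next(self._order), name, action))

    def run(self, board, max_steps=10):
        """Execute up to max_steps actions and return their names in order."""
        trace = []
        while self._heap and len(trace) < max_steps:
            _, _, name, action = heapq.heappop(self._heap)
            trace.append(name)
            # An action may change the blackboard and return follow-up
            # (priority, name, action) triples that join the agenda.
            for follow_up in action(board):
                self.add(*follow_up)
        return trace


def verify_sequence(board):
    board.append("sequence verified")
    return []


def extend_word(board):
    board.append("word extended")
    # The new hypothesis makes a verification step worthwhile.
    return [(0.9, "verify_sequence", verify_sequence)]


def segment_signal(board):
    board.append("signal segmented")
    return []


scheduler = Scheduler()
scheduler.add(0.4, "segment_signal", segment_signal)
scheduler.add(0.7, "extend_word", extend_word)
board = []  # a plain list stands in for the blackboard in this sketch
trace = scheduler.run(board)
print(trace)
```

Because follow-up actions are ranked against everything already on the agenda, a promising new opportunity can overtake older, lower-priority work. That is the opportunistic behavior the text describes, and the step budget plays the role of limited processing resources.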